# Kristina Botyriute

# **Access to Online Resources**

A Guide for the Modern Librarian

Kristina Botyriute, OpenAthens, Eduserv, Bath, UK

Photographs by Danielle Mac Innes, Edward Borton, Phil Coffman, Kristina Botyriute, Kai Oberhäuser, Pavan Trikutam, Angelika Levshakova, Philipp Berndt, Antonina Bukowska, Riciardus, Jakob Owens, Margarida C Silva, Clem Onojeghuo, Michał Parzuchowski, Daria Nepriakhina, Anastasia Petrova, Antonio Lapa, Tim Gouw, Marc Wieland, rawpixel.com, Jessica Furtney and David Marcu. Hand-drawn illustrations by Ieva Botyriute.

ISBN 978-3-319-73989-2 ISBN 978-3-319-73990-8 (eBook) https://doi.org/10.1007/978-3-319-73990-8

Library of Congress Control Number: 2018935111

© The Editor(s) (if applicable) and The Author(s) 2018. This book is an open access publication.

**Open Access** This book is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this book are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Cover Illustration: Front cover photograph by Ashley Batz; back cover photograph by Jill Heyer

Printed on acid-free paper

This Springer imprint is published by the registered company Springer International Publishing AG part of Springer Nature. The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

Helping you get the most out of life by helping you get the most out of technology.

Eduserv

**01** Introduction

**02** Authentication and Authorisation

Jane and Ben, Before we start

**03** Web based authentication

What is HTTP?, HTTP Basic Authentication, HTTP Digest Authentication, HTTP(S) Forms Authentication, Cookies!, More about cookies ..., King and bishop: certificates, Key concepts

**04** IP address recognition

On and off site, Remote access: local build vs cloud based, Security considerations, Key concepts

**05** SAML

How it works ..., You ..., ... and Them, Federation, Key concepts

**06** OpenID Connect

Open Authorisation 2.0, OpenID Connect, Key concepts

**07** Basic Troubleshooting

60 second diagnostics, Setting up access, The fastest way to get help, The End, Bibliography

#### "Access management is a very complicated beast",

concluded one of my customers at the end of a lengthy support call. This might indeed reflect how many librarians feel these days, but it doesn't need to be that way! After reading this book, you will be able to skillfully navigate the maze of online access management technologies and decide what serves your library's needs best.

According to the Gartner IT Glossary (2012), "identity and access management (IAM) is the security discipline that enables the right individuals to access the right resources at the right times for the right reasons." Simply put, it is making sure your users are who they say they are and only have access to what you want them to have access to. In addition to preventing unauthorised parties from exploiting your organisation's resources, IAM technologies can help manage subscriptions to online resources where cost is based on the number of users accessing protected content.

Some publishers charge for every single user, in which case you will want to make sure you have an up-to-date list of individuals who need this resource as well as ensure appropriate permissions are in place. This is particularly relevant to small libraries where the budget is limited.

As an international technical pre-sales consultant for OpenAthens, I frequently speak to librarians from all over the world. The sheer number of technologies a typical librarian deals with on a daily basis is astonishing. Often they are expected to learn on the job, which can be stressful in a busy environment, especially if communication between the library and the IT department is poor.

The following chapters are written for knowledge workers who are involved with managing access to digital content online and cannot afford the time to read book after book of technical material to make sense of all the nuts and bolts that make up IAM. I have covered all the main concepts in this book.

#### Jane and Ben

Monday morning. Electronic resources librarian Jane makes herself a cup of coffee, sits down at her desk and types her username and password into the login screen. Instantly, the computer sends these credentials to a central place - the directory, where all organisational accounts are listed. The most popular of these is Microsoft's Active Directory, but on rare occasions you may be dealing with alternatives such as OpenLDAP, Univention Corporate Server (UCS), ApacheDS or even the futuristic concept of Directory-as-a-Service.

So what happens when Jane's credentials reach the directory? The server checks if Jane is a registered user and if the password is correct. If so, she is authenticated into the system.

Jane opens the shared drive to find some reports but accidentally clicks on the 'HR' icon, causing a warning message to appear advising that she does not have permission to access this folder. She then clicks on 'Reports' as initially intended and it opens. This is an authorisation decision influenced by a variety of security policies in Jane's organisation, determining specific permissions for each user or user group.

In the context of accessing digital resources online, authentication and authorisation may occur a number of times before users are presented with the content they are trying to access.

Ben is a chemistry student who has found an interesting article on ScienceDirect (sciencedirect.com). In order to read the full article, Ben must sign into the website. He knows his university has access to content on this website and selects the 'Sign in via your Institution' option. The following sequence of events may sound like a long, intricate process but in reality it gets executed in a split second:

First, a form for credentials is displayed and as soon as Ben enters his details, his organisation authenticates him as a valid user.

Then, Ben's institution passes a small set of information to ScienceDirect. This set includes details about Ben as well as his university and is used by the publisher to carry out authentication against the list of subscribing organisations. We can think of it as a second round of the same process, only now on the provider's end.

Lastly, the university is verified to have a valid subscription and authentication is successful. However, the article of interest is published in a journal his institution has not yet bought access to, and so the authorisation fails.

Ben sets off to his university's library to discuss his options ...

## KEY POINTS

Authentication validates the user's identity. Who are you?

Authorisation checks what permissions the user has. What can you access?
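The two questions can be kept apart in code as well. Below is a minimal sketch of Jane's morning, assuming a made-up directory and made-up folder permissions (the names `DIRECTORY`, `FOLDER_ACCESS`, `authenticate` and `authorise` are hypothetical, and a real directory would never store passwords in plain text):

```python
# Hypothetical directory of organisational accounts (illustration only:
# real directories do not keep passwords in plain text).
DIRECTORY = {
    "jane": {"password": "coffee-first", "groups": {"library"}},
}

# Hypothetical security policy: which groups may open which shared folder.
FOLDER_ACCESS = {
    "Reports": {"library"},
    "HR": {"hr"},
}


def authenticate(username: str, password: str) -> bool:
    """Who are you? Check the credentials against the directory."""
    account = DIRECTORY.get(username)
    return account is not None and account["password"] == password


def authorise(username: str, folder: str) -> bool:
    """What can you access? Compare the user's groups with the folder policy."""
    groups = DIRECTORY[username]["groups"]
    return bool(groups & FOLDER_ACCESS.get(folder, set()))


print("Logged in:", authenticate("jane", "coffee-first"))
print("May open HR:", authorise("jane", "HR"))
print("May open Reports:", authorise("jane", "Reports"))
```

Note that Jane is authenticated only once, while an authorisation decision is made again for every folder she tries to open.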

Before we go ahead, we need to make friends with one concept. A PROTOCOL is a big scary word, often used by IT guys to scare people off so they don't have to work as much (I am joking, of course). My personal, if somewhat geeky, opinion is that everything boils down to a protocol. I will explain.

## A Shopping Protocol. ASP.

One must walk into a shop, collect items into a trolley or a shopping basket and either self-checkout or go to the till to pay. Whilst there may be variation in customers' choice of container and method of checkout, ultimately the procedure is to collect items, pay and leave. Any other way to obtain goods from the shop is non-standard and usually unsupported by law.

Essentially, a protocol is a set of rules designed to make our life easier. The sequence of events may vary in length and execution depending on who is doing the shopping, but the rules of the protocol enable a clear goal, path and outcome.

What about online shopping? Well, this would be ASP 2.0. The important thing to note, though, is that a higher version of something does not always guarantee an improvement - sometimes it is just another way of achieving the same result.

With that in mind, let's go ahead and explore the most common authentication and authorisation methods that protect digital content online today.

# What is HTTP?

HTTP stands for Hyper Text Transfer Protocol.

It is a set of rules for transferring files on the World Wide Web. When you open your browser and type in an address, you are really saying: 'GET me this web page!'. Collaborating with a number of other protocols, HTTP fetches you the page and serves it up on the screen.

GET https://www.google.com/search?q=test

Requesting information is not the only thing you can do with this useful protocol. Whilst there is no need to explore all nine methods of HTTP, we will look at another popular one: POST. What does it do? Exactly what it says: it allows you to send information. The link in your browser is the address on an envelope and the 'letter' with information is enclosed within.

POST https://www.any\_internet\_store.com/Login

logonID: username

logonPassword: password
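To see the difference between the two methods, here is a small sketch using Python's standard library. It only builds the two requests from the examples above and prints them; nothing is sent over the network. The variable names are my own, and the store address is the made-up one from the example.

```python
from urllib.parse import urlencode
from urllib.request import Request

# GET: the data travels in the address itself, as a query string.
get_request = Request("https://www.google.com/search?" + urlencode({"q": "test"}))

# POST: the address is the envelope, the form data is the letter inside it.
form = {"logonID": "username", "logonPassword": "password"}
post_request = Request(
    "https://www.any_internet_store.com/Login",
    data=urlencode(form).encode("ascii"),
    method="POST",
)

# Inspect both requests without sending them.
print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.full_url)
print("Enclosed letter:", post_request.data)
```

The search term is visible in the GET address, whereas the POST address stays the same whatever the form contains.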

Web based authentication has many flavours and what we know as 'username and password' uses three of them: HTTP Basic Authentication, HTTP Digest Authentication and HTTP(S) Forms Authentication.

There is a lot more to this simple method than meets the eye and we will delve right into what happens behind the scenes.

Hyper Text Transfer Protocol (HTTP) facilitates communication of data on the World Wide Web

GET is a way to request data

POST is a way to submit data

# HTTP Basic Authentication

HTTP Basic Authentication is the oldest username and password authentication method there is. It dates back to 1989, when Sir Tim Berners-Lee invented the World Wide Web. It works like this: a user types in credentials and from then on they must be passed to the website each time the user's actions result in a request for any new content to be displayed. Remember GET? This is it! When content is protected by Basic Authentication, whenever the user clicks to open a new article, types in a search query or navigates to a different area of the website, credentials will have to be included in that request. Here is what this might look like:

http://username:password@www.example.com

http://example.com?un=username&psw=password

This could get quite inconvenient if one was forced to type their username and password over and over again. Instead of prompting for login every other click, the web browser takes care of this by helpfully storing the user's credentials until a logout button gets hit or the web browser window is closed.

Needless to say, due to its age HTTP Basic Authentication has major security flaws. As you have already noticed, the example links above pass the username and password in clear text. This authentication method supports base64 encoding too, but that doesn't make it more secure as the text can be decoded in seconds using online tools. Can you guess what is encoded in this link?

https://example.com?un=**dXNlcm5hbWU=**&psw=**cGFzc3dvcmQ=**

(If you can't, go to base64decode.org and copy-paste the values in bold.) Although most digital publishers opt for more secure methods to protect their content, some still support Basic Authentication. Reasons range from scarce development resources to faith in humanity. Fortunately for us, this method has a distinct pop-up login screen which will help you identify it; a real life example appears a little further on. Whilst I am not advocating the idea, I have seen institutions negotiate lower subscription prices upon discovery of Basic Auth. Others have effectively encouraged their provider into implementing an alternative authentication method.
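If you would rather not paste credentials into a website, the same decoding can be done locally. This is a minimal sketch with Python's standard library, using the example link from this section; the variable names are my own. It shows that base64 is an encoding, not protection: no key or password is needed to reverse it.

```python
import base64
from urllib.parse import parse_qs, urlsplit

link = "https://example.com?un=dXNlcm5hbWU=&psw=cGFzc3dvcmQ="

# Pull the two parameters out of the query string of the link.
params = parse_qs(urlsplit(link).query)

# Reverse the base64 encoding of each value.
decoded = {
    name: base64.b64decode(values[0]).decode("utf-8")
    for name, values in params.items()
}

for name, value in decoded.items():
    print(name, "=", value)
```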

# HTTP Digest Authentication

This is a more secure version of HTTP Basic Authentication. From the user's perspective everything looks the same (real life example of BA, as promised):


The only difference with Digest Authentication is that the password will no longer be sent in clear or base64 encoded text. It is now encoded and hashed. What is a hash? Otherwise known as a message digest, a hash is a value representing the original string. For example: 'password' hashed in MD5 is '5f4dcc3b5aa765d61d8327deb882cf99'.

MD5 (Message Digest 5) is the default algorithm used for HTTP Digest Authentication. Problem? MD5 can be cracked in a blink: hashkiller.co.uk cracked the above example in 104 milliseconds.
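The hash quoted above is easy to reproduce with Python's standard `hashlib` module. This sketch only hashes the bare word; the real Digest scheme combines the password with further values (such as the username and a server-supplied nonce) before hashing, which is left out here.

```python
import hashlib

# MD5 turns any input into a fixed-length, 32-character hexadecimal value.
digest = hashlib.md5("password".encode("utf-8")).hexdigest()
print(digest)

# The same input always gives the same hash, which is exactly why common
# passwords can be looked up in precomputed tables within milliseconds.
same_again = hashlib.md5("password".encode("utf-8")).hexdigest()

# A tiny change to the input produces a completely different value.
capitalised = hashlib.md5("Password".encode("utf-8")).hexdigest()
print(capitalised)
```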

Upon a (hopefully brief) encounter with Digest Authentication, my best advice is to note what the creators themselves said about the method:

"The Digest Access Authentication scheme is not intended to be a complete answer to the need for security in the World Wide Web. This scheme provides no encryption of message content. The intent is simply to create an access authentication method that avoids the most serious flaws of Basic Authentication." (Leach et al., 1999)

## KEY POINTS

HTTP Basic Authentication passes credentials within the link, in clear or base64 encoded text

HTTP Digest Authentication hashes the password with MD5

base64 can be decoded using tools freely available online

MD5 is the default algorithm used for HTTP DA. This algorithm was first cracked in 1996 and has been considered unsuitable for use since 2010

# HTTP(S) Forms Authentication

This method submits the username and password to the server by power of POST. (Think of an envelope with a letter inside.) It does so in clear text; however, it is most commonly used with HTTPS for added security. (Think of an envelope with a magic seal on top.)

What is HTTPS? Hyper Text Transfer Protocol Secure. You know it's in use when you see this:

Forms authentication is incredibly popular and is the most widely adopted variant of username and password authentication. POST as a method is more secure than GET: it will never pass data in the address bar, and it will not be cached or remain in the browser history. Still, it can be read if intercepted unless used in conjunction with HTTPS. To illustrate the process, I will attempt to access MAG Online Library. POST to: https://www.magonlinelibrary.com/action/dologin

Username and Password do not match.

The result is an error message, as expected. Should my credentials have matched the records on the publisher's end, the code on the website would have changed to contain my username and password in the login form. This would then be used to redirect me to the post-login screen, print 'Hello, Test' and potentially load my personal profile for this website.

HTTPS forms authentication is a much better way to connect individual users to protected content than Basic or Digest Authentication. For one, the login form will look and behave as desired by the creator, whilst the other two leave us stuck with a pop-up box and an ugly error 401 when things go south. Many publishers support forms authentication as an option for individual subscribers whilst institutional users are often encouraged to use federated access, covered later in this book.

POST http://www.example.com/auth.php - more secure than GET, but data can be read if intercepted by a man-in-the-middle attack

POST https://www.example.com/auth.php - most secure: credentials are encrypted and therefore useless if captured.

# Cookies!

"By continuing to use this site you consent to the use of cookies on your device as described in our cookie policy unless you have disabled them. You can change your cookie settings at any time but parts of our site will not function correctly without them" (ft.com)

Also known as HTTP entity authentication, cookies are different from username and password driven recognition. Much like real cookies, digital ones also enhance the quality of life or, in particular, user experience on the web. As I'm sure you will agree, we would struggle to find a website that does not make use of cookies in this day and age. So, what is this cookie?

A cookie is a small piece of text that stores information about your interaction with a website. If you clicked on the cookie policy hyperlink in the notification displayed at the top of this page, you would have been taken to one of the nicest cookie policy explanation pages I've come across so far. Not all publishers go to the trouble of explaining themselves in such detail and therefore it is worth familiarising yourself with how cookies work. According to Wright, Freedman and Liu (2008), "in contradiction to the claim that no information is sent from your computer to anybody outside your system, the majority of cookies are interactive (that is, the information is not only written to them but also read from them by the web servers you connect to)."

Session cookies will 'go out of date' as soon as the browser is closed or the session time is up. This means that if my aunt Mary was shopping for groceries for her Sunday roast and had a cart full of goodies, one unfortunate click on the red X at the top of the browser would render her cart empty when she navigates back to the site. Such an event would likely cause her some grief and perhaps this is one of the reasons why session cookies are not overly popular amongst online retailers. What if the browser was set to purposely deny session cookies? My aunt Mary would not be able to add any potatoes to her cart at all! Websites do not have a memory of their own and so she would be treated as a new visitor every time she opened a different page.

Persistent cookies are stored either in "jars" in your browser or on your device's hard drive. Being plain strings of text, they cannot do anything on their own but are detectable by websites and serve as reminders of the visitor's language preference, bookmarks or theme selection. On rare occasions cookies would store the user's credentials, which could result in auto-login, although from a security perspective this is not something that should be endorsed.

Cookies come in two flavours: persistent and session

## More about cookies ...

When a cookie is initially set, several very important parameters are specified: the cookie's name, expiry date, domain, session identifier and path.

NAME: Chocolate Chip Cookie

EXPIRY DATE: 03/2020

BRAND: Cookie Company

SESSION: first shopping today

PATH: 3rd aisle from the left

There are others, such as a secure parameter, but they aren't always used. Let's take a look at how the cookie is set upon clicking 'Accept and Close' when visiting nature.com:


#### POST

cookies: accepted

Set-Cookie: euCookieNotice=accepted; domain=www.nature.com; path=/; expires=Mon, 02 Jul 2018 16:31:07 -0000;

Looks technical? Here's what it all means.

euCookieNotice=accepted; acknowledges my acceptance of cookies.

domain=www.nature.com; means the cookie will only be valid here. If nature.com had any sub-domains, such as 'xyz.nature.com', then a separate cookie would have to be set for those. How would we set a cookie to include all sub-domains? '.nature.com'

path=/; means the cookie will apply to all pages on this domain, not just this particular one.

expires=Mon, 02 Jul 2018 16:31:07 -0000; sets the cookie's lifetime to a year.
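The breakdown above can be mirrored in a few lines of code. This is a simplified sketch, not how a browser actually does it: `parse_set_cookie` and `domain_matches` are hypothetical helper names, and the domain rule follows the simplified description given here (an exact name matches only itself, a leading dot covers all sub-domains).

```python
def parse_set_cookie(header: str) -> dict:
    """Split a Set-Cookie value into the cookie's name, value and parameters."""
    parts = [part.strip() for part in header.split(";") if part.strip()]
    name, _, value = parts[0].partition("=")
    cookie = {"name": name, "value": value}
    for part in parts[1:]:
        key, _, val = part.partition("=")
        cookie[key.strip().lower()] = val.strip()
    return cookie


def domain_matches(cookie_domain: str, host: str) -> bool:
    """Simplified rule: '.nature.com' covers sub-domains, 'www.nature.com' does not."""
    if cookie_domain.startswith("."):
        return host == cookie_domain[1:] or host.endswith(cookie_domain)
    return host == cookie_domain


header = (
    "euCookieNotice=accepted; domain=www.nature.com; path=/; "
    "expires=Mon, 02 Jul 2018 16:31:07 -0000;"
)
cookie = parse_set_cookie(header)

for key, value in cookie.items():
    print(f"{key}: {value}")

# A cookie without an 'expires' parameter would behave as a session cookie.
print("Persistent:", "expires" in cookie)
print("Valid on xyz.nature.com:", domain_matches(cookie["domain"], "xyz.nature.com"))
```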

As you will have already noticed, there is no session identifier. This means the cookie we've just analysed is not a session one. To check, simply close the browser and re-open it again - did you see the cookie message appear at the top? Just for fun, I checked what else was set on my browser as soon as I got to the website. The list turned out to be quite extensive, containing both session and persistent cookies (yes, all of those folders, not just nature.com):


Cookies can significantly enhance user experience and some use of them is essential. Presenting users with a message that signifies acceptance of all cookies on the site is required by law in many countries.

Clear your cache and cookies if bothered by unsolicited ads (or install an advert blocking extension).

Check the cookie policy if not presented with an informational message - it is good fun and good practice to know who is interested in your activity online

# King and bishop: certificates

I am yet to see an online content publisher who would insist on certificate authentication. It is useful to know nevertheless, as you may be using certificates to access Office 365, protect the connection to your work network over the VPN or even just log in to the portal where all of your digital resources are listed. Certificate authentication can replace user credentials or be used in conjunction with them for increased security. Winnard et al. (2016) defined the concept in the following way: "one party uses a certificate to identify itself, the other party must validate it. This process is referred to as a handshake."

At the risk of sounding medieval when explaining modern technology, I will compare a digital certificate to an official seal, confirming to the King that the letter is from the bishop. The bishop will have used his ring to stamp it, then ordered his trusted messenger to deliver the letter to the King. This letter is of high importance and the King needs to be certain that the seal is not forged. What if someone has stolen the bishop's ring and gone on stamping about? He refers the matter to the archbishop (Certificate Authority), a highly respected and trusted individual who is in charge of and regularly keeps in touch with all the bishops. The archbishop inspects the seal and confirms its validity. He also informs the King that the sender is alive and well, as he has only recently attended a dinner party with him.

The King is now sufficiently assured of the authenticity of this letter and proceeds to read it.

Suppose the bishop has been demoted: he would then be added to the revocation list and the archbishop would advise the King not to trust any correspondence sealed with the demoted bishop's stamp. The same would apply if the bishop's reign in the region has come to an end (this would unfortunately mean the bishop has passed away): the archbishop would notify the King that the official seal has expired and should not be trusted.
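The King's checks translate into three questions: was the seal vouched for by an authority I trust, has its owner been revoked, and has it expired? Here is a sketch of that reasoning in code. It models the analogy only and involves no real cryptography; `Seal`, `TRUSTED_AUTHORITIES`, `REVOCATION_LIST` and `king_trusts` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Seal:
    """Stands in for a digital certificate in the King and bishop analogy."""
    owner: str     # who the seal identifies (the bishop)
    issuer: str    # who vouches for it (the Certificate Authority)
    expires: date  # last day on which the seal may be trusted


# Hypothetical trust settings held by the King (the browser).
TRUSTED_AUTHORITIES = {"archbishop"}
REVOCATION_LIST = {"demoted bishop"}


def king_trusts(seal: Seal, today: date) -> bool:
    """Accept the letter only if all three checks pass."""
    if seal.issuer not in TRUSTED_AUTHORITIES:
        return False  # nobody the King trusts vouches for this seal
    if seal.owner in REVOCATION_LIST:
        return False  # the owner has been demoted
    return today <= seal.expires  # the seal must not have expired


today = date(2018, 1, 1)
good_seal = Seal("bishop", "archbishop", date(2019, 1, 1))
revoked_seal = Seal("demoted bishop", "archbishop", date(2019, 1, 1))
expired_seal = Seal("bishop", "archbishop", date(2017, 1, 1))

print(king_trusts(good_seal, today))
print(king_trusts(revoked_seal, today))
print(king_trusts(expired_seal, today))
```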

When you are a King, here is how your browser would declare it:

# IP address recognition

IP address recognition, often referred to as a "traditional authentication method", is very old. It pre-dates the HTTP Basic Authentication discussed earlier on and goes as far back as the 1970s, the time before the World Wide Web as we know it. Why did I call it recognition, not authentication? Because the elements required to identify an individual are missing. It deals with authorisation only and works by checking whether the traffic is coming from a known location. For example: Ray wants to access the International Journal of Metrology and Quality Engineering. His institution subscribes to it and Ray is accessing from an on-campus computer. Upon detecting a new connection, metrology-journal.org checks Ray's IP address against the list of authorised IP addresses and grants access to the content.
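The check carried out on the publisher's side can be sketched with Python's standard `ipaddress` module. The address range below is a reserved documentation range standing in for a university's real one, and `AUTHORISED_RANGES` and `is_authorised` are hypothetical names.

```python
import ipaddress

# Hypothetical list of subscriber IP ranges held by the publisher.
# 192.0.2.0/24 is reserved for documentation and stands in for a campus network.
AUTHORISED_RANGES = [ipaddress.ip_network("192.0.2.0/24")]


def is_authorised(source_ip: str) -> bool:
    """Grant access if the connection comes from a known location."""
    address = ipaddress.ip_address(source_ip)
    return any(address in network for network in AUTHORISED_RANGES)


print(is_authorised("192.0.2.17"))   # Ray, on an on-campus computer
print(is_authorised("203.0.113.5"))  # somebody connecting from elsewhere
```

Notice that nothing in the check says who Ray is: any person sitting at that computer would get the same answer.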

IP recognition is without a doubt the most widely used method for institutional logins in the online publishing industry. This is a very convenient option that requires minimal effort to set up - a simple network firewall can do the job. Here is another common scenario: a university is purchasing a subscription to an online resource, such as Annals of Internal Medicine. The range of the university's IP addresses is specified on the order form, the publisher adds them into the entitlements system (or a firewall access list) and job done!

The traffic for each incoming IP is likely to be monitored for security reasons and to measure usage, which may influence the cost when it comes to renewal. The setup itself, though, is exceptionally straightforward. But how do we use the same method to enable access for users off-site?

The reigning king of IP-based remote access technologies is a proxy server. Let's use a medical student, Helen, to illustrate how this works. The deadline is fast approaching and Helen needs to access annals.org from home to complete her assignment. She logs into the library portal where links to various online resources are listed and clicks on the 'Annals of Internal Medicine' link, which is configured to route the request via her university's proxy server. The proxy changes Helen's IP address into one that has been pre-agreed to represent this institution and the publisher authorises access based on the proxy IP instead of Helen's real one.
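Helen's journey can be sketched as follows. The addresses come from reserved documentation ranges, and `PROXY_IP`, `route_via_proxy` and `publisher_authorises` are hypothetical names; a real proxy relays the whole web conversation rather than editing a dictionary.

```python
import ipaddress

# Hypothetical address pre-agreed between the university and the publisher.
PROXY_IP = "198.51.100.10"

# The publisher's allow list contains the proxy, not Helen's home address.
PUBLISHER_ALLOW_LIST = {ipaddress.ip_address(PROXY_IP)}


def publisher_authorises(source_ip: str) -> bool:
    """The publisher only looks at where the request appears to come from."""
    return ipaddress.ip_address(source_ip) in PUBLISHER_ALLOW_LIST


def route_via_proxy(request: dict) -> dict:
    """Forward the request so that it arrives from the proxy's address."""
    return {**request, "source_ip": PROXY_IP}


# Helen at home: her real address is unknown to the publisher.
helen_request = {"source_ip": "203.0.113.77", "url": "https://annals.org/"}
proxied_request = route_via_proxy(helen_request)

print("Direct:", publisher_authorises(helen_request["source_ip"]))
print("Via proxy:", publisher_authorises(proxied_request["source_ip"]))
```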

Some organisations like to keep it all in-house, in particular those benefiting from a large IT team or those that do not believe in cloud technologies. A proxy server is either installed as a stand-alone entity on the local network or may come as an add-on feature of another IAM technology, such as OpenAthens LA. In such a setup, the organisation takes full responsibility for the maintenance of its own proxy server: patching, upgrades, resilient architecture, everything. When strict security policies must be adhered to but the institution still wishes to utilise IP recognition for remote access, this is often a good choice. Some providers charge per traffic volume or limit the number of concurrent sessions. In response to that, some IT teams feel that having a proxy server on-site helps them maintain a better grip on usage management. EZProxy is an example of a proxy well known to academic libraries. It offers two options: a locally installed EZProxy server or a service hosted by the Online Computer Library Center (OCLC). Whilst ideas to create an open source alternative are surfacing due to the observed continuous rise in prices for this service (Sabol, 2016), the only real alternatives today are Web Access Management (WAM) proxy or OpenAthens, where a managed proxy service is part of the package.

Hosted proxy services take a lot of stress away as the provider takes care of all the upgrades and maintenance and guarantees a high uptime of the service. As with everything, migration from a local installation to a hosted service requires careful planning. Lynne Edgar from Texas Tech University (TTU) Libraries (2015) has shared the experience of migration in the Journal of Electronic Resources Librarianship, making the following recommendation: "I suggest other libraries thoroughly understand their authentication process < ... > when implementing a hosted service. < ... > Be sure to ascertain the process used to access resources via mobile devices when moving to hosted EZProxy. Ensure tablets and phones will be able to access all of your electronic resources formats whether users are on or off campus".

Her recommendation to thoroughly understand the local authentication process is sound and applicable whichever IAM solution you may be considering. If you know what systems are in place and what your user journey looks like, a good support team should be able to assist you with the rest. In TTU Libraries' case, the process of migration unintentionally stretched out to seven months and there was a loss of service to external patrons along the way.

## KEY POINTS

A locally hosted proxy server will have to be looked after. Organisations that have implemented this solution commonly have a dedicated member of staff who continuously updates proxy configurations.

Proxy in the cloud takes a lot of work off your hands and is much more convenient than a locally hosted one. Understanding your institution's security policies as well as the existing user journey will help reduce disruptions during the implementation.

"On average, 58% of the IP ranges held by publishers to authenticate libraries who license their content are inaccurate"

PUBLISHER SOLUTIONS INTERNATIONAL, 2017

As convenient as it may be, IP recognition has its flaws. Many publishers code their websites in such a manner as to aid researchers in their efforts. This aid would often take the form of personalisation features, such as the ability to save useful articles or advanced search queries, compile a list of references, share material with fellow researchers and so on. All of this convenience is unattainable when an IP address is used for authorisation. Why? Because the IP address does not uniquely identify a user, unless the user has a static address configured on the device and that device is utilised exclusively by that one user, which is a somewhat unlikely scenario. In fact, it is common practice to only use one or two IP addresses to identify the whole site! The most a digital content provider can achieve is to match the incoming IP address to the list of subscribers and make a remark of this somewhere on the website, such as "This resource is provided to you courtesy of Helen's University".

Something to consider: networking teams rarely discuss their work with the library (nor would librarians find it interesting). So whenever an institution's external IP address changes, the library would be informed of the new one and the old one would be left to function for a while to avoid any disruptions. How often do we bother to contact all the publishers to remove the old IP address? My experience shows this is not a common practice, as many subscribers get misrecognised every other day and contact our service desk for help.

In addition to being susceptible to man-in-the-middle attacks, access by IP recognition has been discovered to suffer from general abuse by subscribers. Publisher Solutions International Ltd (2017) have recently carried out an extensive research and data clean-up exercise where they came across numerous instances of misuse and licence abuse ... This led to the opening of ipregistry.org - a growing repository of approximately 1.5 billion validated IP addresses from over 60,000 organisations worldwide. These addresses are added and updated by the subscribing institutions themselves; the benefit is that they only have to do this once. Participating publishers keep an eye on this list and, upon detecting changes in their subscribers' records, update their access management systems automatically.

The site has just gone live but has already been enthusiastically greeted by large publishers such as Wiley and Cambridge University Press, as well as by librarians in the hope they will be able to cut down on the manual effort required to update every provider every time one of their on-site or proxy IP addresses changes.

## KEY POINT

IP recognition is easy to implement and is sometimes perceived as the key element to guarantee anonymity. It is also a trade-off between convenience for the library and convenience for the end user.

## Key concepts

"While it may seem like no one is paying attention, internet users are starting to realize their data has value. And it's a value that deserves better than a password."

JOHN FONTANA, 2017

Security Assertion Markup Language, or SAML (sam-el), is a well-established and mature open standard, designed for the best possible user experience with the added benefit of maximum security. Praised by information security professionals, it passes selective information about an individual without ever giving out the user's credentials! Better yet, one of the main purposes of this protocol is to aid Single Sign On, which takes care of the headache associated with maintaining passwords. Sounds magical? Let's have a look at how it works.

An engineering student, Ed, wants to watch a video on the IET.tv website. To gain access, he needs to log in via his institution or register as an individual subscriber and pay the fee. Ed selects the 'Federation Login' option, chooses to log in via the UK Federation and picks his institution from the list. IET then forwards him to his university's login page so he may authenticate himself. The username and password are accepted and the university replies directly to the publisher with the requested information about this student, confirming he belongs to the institution and is entitled to access this resource. The publisher checks that the response contains what they need to make an authorisation decision and, if everything matches up, Ed is granted access to the video of his interest. Happy days!

Consider the following picture illustrating a similar scenario:

Although implementation of SAML requires a little more effort on the publisher's end than HTTP Basic Authentication or IP recognition, it does pay for itself and is therefore becoming increasingly popular, especially where digital content is of high value. Giant publishing houses such as McGraw-Hill, Oxford University Press and Elsevier were among the first to adopt SAML authentication for institutional subscribers.

## KEY POINT

SAML authentication does not expose users' credentials, validating access based on the selective information passed in the background instead.

The decision to trust someone is often made based on what you know about that person. Trust is the key principle of SAML and, like in real life, identity plays a major part. Similar to a country issuing passports to its citizens, you, as an institution, are providing virtual identities to your users. Depending on your security and data protection policies, you will be collecting certain information about them, such as name and surname, email address, position, maybe even home address, telephone number, date of birth and shoe size! This helps create an accurate user profile, stamp it with a unique username and assign appropriate permissions and privileges to each individual. In the world of SAML, your country is called an IdP, or Identity Provider. This is very important! The identity provider is you.

Now that you have a country to rule, you need a country code. Whilst you would expect one to three digits in a normal world, Identity Providers are defined by a unique string of characters that often looks like a web address but isn't (just to confuse you). It's called an entityID. For example, "https://idp.adamscollege.edu/entity" might identify Adams College. An important thing to remember is that it doesn't do anything: if you clicked on it, it wouldn't take you anywhere. So why the weird notation? Well... for one, 'sfghhjkd1334' is not as easy on the eye, although it could serve the purpose just fine.

The entityID is quite an important element: much like a country code, it can make or break the connection. As such, I often get asked, "What happens to the entityID upon switching from one software to another?" The answer is: nothing needs to happen unless you choose so. You may decide to keep it exactly the same, and users will not know the difference, or change it to match the new software. Changing the entityID will require appropriate notifications to be sent to your users as well as to online content providers.

OK, last bit. Your population has grown and you now have more than one city. If you are Spain, how do we help route a call to Madrid and not Zaragoza? We use a city code, or scope: "madrid.es". Here's what this would look like in a SAML call directory:

EntityID: https://idp.espana.es/metadata

Scope: madrid.es, zaragoza.es, barcelona.es, valencia.es, seville.es, palma.es

And if you happen to be Monaco?

EntityID: https://idp.monaco.mc/metadata

Scope: monaco.mc

## KEY POINTS

Identity Provider or IdP creates virtual identities for users. Institutions use various software products for this task: Shibboleth, OpenAthens, ADFS, etc.

EntityID uniquely identifies each Identity Provider.

Scope is the 'perimeter' of where the user is coming from. For example: "maincampus.university.com", "overseas.university.com".
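To make the "call directory" idea a little more tangible, here is a minimal sketch in Python. It assumes the directory is nothing more than a lookup from entityID to the scopes registered for it, using the Spain and Monaco examples above; the function name `scope_is_registered` and the sample values such as `member@madrid.es` are made up for illustration and do not come from any real IdP software.

```python
# A toy "SAML call directory": entityID -> scopes registered for that IdP.
# The entries mirror the Spain and Monaco examples; nothing here is real metadata.
DIRECTORY = {
    "https://idp.espana.es/metadata": {
        "madrid.es", "zaragoza.es", "barcelona.es",
        "valencia.es", "seville.es", "palma.es",
    },
    "https://idp.monaco.mc/metadata": {"monaco.mc"},
}


def scope_is_registered(entity_id, scoped_value):
    """Check that the part after '@' is a scope registered for this entityID."""
    if "@" not in scoped_value:
        return False  # not a scoped value at all
    scope = scoped_value.rpartition("@")[2]
    return scope in DIRECTORY.get(entity_id, set())


print(scope_is_registered("https://idp.espana.es/metadata", "member@madrid.es"))
print(scope_is_registered("https://idp.monaco.mc/metadata", "member@madrid.es"))
```

The second call is rejected because Monaco never registered "madrid.es": a country code and a city code only work as a pair.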

Service Providers are the other half of the SAML equation. Most commonly you will know them as digital content publishers (IEEE, MAG Online Library, Science Direct), but a service provider can be anyone enabling their login with this protocol. Blackboard, Moodle, Canvas, EBSCO Discovery, Alma, Office 365, Lynda.com and Google all support SAML for Single Sign On and authentication purposes.

How do publishers recognise their subscribers? They do this by analysing an attributes statement sent to them by the Identity Provider. This statement, called a SAML assertion, contains information about the institution and an individual user, based on what you have decided to release. Consider the following scenario: Anna is a physicist from the USA who will be spending a few weeks in Switzerland, collaborating with CERN scientists. In addition to an invitation letter, she must produce evidence of her identity and education to obtain her temporary researcher's pass.

When accessing online resources, authorisation decisions are made in a similar manner: the publisher matches your attributes statement to a certain checklist and if conditions are met, access will be granted. If not, it is denied.
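The checklist idea can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the attribute names `affiliation` and `scope`, the accepted values and the two sample users are hypothetical, and a real publisher would apply its own rules to a verified SAML assertion.

```python
def access_decision(released_attributes, checklist):
    """Grant access only if every item on the checklist is satisfied."""
    for name, accepted_values in checklist.items():
        released_values = released_attributes.get(name, [])
        # At least one released value must be on the publisher's accepted list.
        if not any(value in accepted_values for value in released_values):
            return "denied"
    return "granted"


# Hypothetical checklist held by the publisher for one subscribing institution.
publisher_checklist = {
    "affiliation": {"member", "staff", "student"},
    "scope": {"adamscollege.edu"},
}

ed = {"affiliation": ["student"], "scope": ["adamscollege.edu"]}
visitor = {"affiliation": ["alum"], "scope": ["adamscollege.edu"]}

print(access_decision(ed, publisher_checklist))
print(access_decision(visitor, publisher_checklist))
```

Note that an attribute the institution chose not to release counts as a failed check, which is why withholding too much can lock users out.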

Below is an excerpt from one such attributes statement:

    <saml:Attribute Name="urn:oid:1.3.6.1.4.1.5923.1.1.1.9" NameFormat="urn:oasis:names:tc:SAML:2.0:attrname-format:uri">
      <saml:AttributeValue>member@ps.openathens.net</saml:AttributeValue>
    </saml:Attribute>
    <saml:Attribute Name="…" NameFormat="urn:oasis:names:tc:SAML:2.0:attrname-format:uri">
      <saml:AttributeValue>kristina.botyriute@eduserv.org.uk</saml:AttributeValue>
    </saml:Attribute>

This is XML so it doesn't look pretty, but I bet you can still make out my work email address and the member value for scope "ps.openathens.net". As digital privacy is one of the major concerns today, your IdP software should allow you to fine-tune any user-related attributes you wish to release or withhold. Such fine-tuning can help achieve the magic combination of security, anonymity and personalisation all at the same time.
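For the curious, this is roughly how a program could read such a statement. The sketch below uses only Python's standard library and a hand-written sample wrapped in an `AttributeStatement` element; the sample is an assumption modelled on the excerpt, not a real assertion. It deliberately skips signature checking and everything else a real SAML library does, so treat it as a reading aid rather than something to deploy.

```python
import xml.etree.ElementTree as ET

# Namespace used by SAML 2.0 assertions.
SAML_NS = "urn:oasis:names:tc:SAML:2.0:assertion"

# Hand-written sample for illustration only.
SAMPLE = """
<saml:AttributeStatement xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion">
  <saml:Attribute Name="urn:oid:1.3.6.1.4.1.5923.1.1.1.9"
                  NameFormat="urn:oasis:names:tc:SAML:2.0:attrname-format:uri">
    <saml:AttributeValue>member@ps.openathens.net</saml:AttributeValue>
  </saml:Attribute>
</saml:AttributeStatement>
"""


def read_attributes(xml_text):
    """Return {attribute name: [values]} for every Attribute element found."""
    root = ET.fromstring(xml_text.strip())
    attributes = {}
    for attribute in root.iter("{%s}Attribute" % SAML_NS):
        values = [
            value.text
            for value in attribute.findall("{%s}AttributeValue" % SAML_NS)
        ]
        attributes[attribute.get("Name")] = values
    return attributes


for name, values in read_attributes(SAMPLE).items():
    print(name, values)
```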

Great! We now know about IdP, SP, entityID, scope and attributes; just understanding this terminology can help you look good in a technical conversation. The key to it all, however, the glue that makes it all work, is the metadata. Metadata is information about information. Or data about data. Not just any data though: descriptive data. Any SAML participant has a metadata file that contains their entityID, scope, attributes, login endpoints and other relevant things. As mentioned before, the key concept of SAML is mutual trust, and it can be established by exchanging the metadata.

## KEY POINTS

Service Provider means anyone that relies on a SAML attribute statement to make authorisation decisions.

Metadata is a descriptive file defining each SAML participant and providing the necessary information to establish mutual trust.

A federation is a collective of IdPs and SPs that have agreed to trust each other. Remember the metadata from the previous page? One of the rules that define trust and interaction in the federation is an aggregation of information about all parties into a large XML file. This is where Identity Providers and Service Providers would enlist their metadata files to make secure communication easier. I have come to think of it as a private scientists' party, as most federations were established to unite the educational bodies of each country. Each has its own rules of acceptance: to join the UK Access Management Federation for Education and Research, the IdP organisation must be an educational or research body based in the United Kingdom. InCommon accepts members from US higher education, research organisations, or sponsored partners of higher education members. Most federations have geographical restrictions, with OpenAthens currently being the only global federation that is not limited to academic institutions (but we could see that change). At the time of writing there are 51 live federations known to REFEDS, the Research and Education Federations group, with a further 16 in a pilot stage.

Federations vary in size and affordability. For example, membership in the UK Federation is free whilst AAF, the Australian Access Federation, charges a \$8436 joining fee plus \$8581 per annum (Aaf.edu.au, 2017).

The Finnish Haka federation comprises 50 members whilst InCommon in the USA boasts a growing community of 944 participants (Incommon.org, 2017). Due to geographical restrictions, however, you may not have much choice unless you live in Texas, USA. Texas has three federations of its own and is eligible to join InCommon as well as the OpenAthens federation. So why would you want to join a federation? Why not just go ahead and create a bunch of one-to-one connections?

First, this would be too cumbersome for everyone involved. It is much easier for a service provider to retrieve records from a big file on the web (or a local copy of this file, which is even faster!) than to create an in-house records system to store each organisation's metadata. Furthermore, such a system would have to be continuously updated in case the Identity Provider changes something, a login point for example. For you as an institution, the benefits include having all the information about your providers in one place, as well as security assurances. You can expect a certain standard of service throughout the federation and, depending on the IdP software in use, completely eliminate the need to involve your technical staff when enabling access to online resources.

## KEY POINT

Joining a federation can dramatically reduce the effort required to connect users to your digital subscriptions.
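To illustrate why one big file is so convenient, here is a small Python sketch that looks up a participant's login endpoints in a federation aggregate. The aggregate below is a tiny hand-written stand-in with invented entries and addresses (`login.adamscollege.edu`, `sp.example-publisher.com`); real federation files are far larger, are digitally signed and should be verified before use, none of which is shown here.

```python
import xml.etree.ElementTree as ET

# Namespace used by SAML 2.0 metadata, in ElementTree's {namespace}tag form.
MD = "{urn:oasis:names:tc:SAML:2.0:metadata}"

# Hand-written miniature aggregate; both entries are invented for illustration.
AGGREGATE = """<md:EntitiesDescriptor xmlns:md="urn:oasis:names:tc:SAML:2.0:metadata">
  <md:EntityDescriptor entityID="https://idp.adamscollege.edu/entity">
    <md:IDPSSODescriptor protocolSupportEnumeration="urn:oasis:names:tc:SAML:2.0:protocol">
      <md:SingleSignOnService
          Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-Redirect"
          Location="https://login.adamscollege.edu/sso"/>
    </md:IDPSSODescriptor>
  </md:EntityDescriptor>
  <md:EntityDescriptor entityID="https://sp.example-publisher.com/entity">
    <md:SPSSODescriptor protocolSupportEnumeration="urn:oasis:names:tc:SAML:2.0:protocol"/>
  </md:EntityDescriptor>
</md:EntitiesDescriptor>"""


def find_login_endpoints(aggregate_xml, entity_id):
    """Return the login endpoints listed for entity_id, or None if it is not a member."""
    root = ET.fromstring(aggregate_xml)
    for entity in root.iter(MD + "EntityDescriptor"):
        if entity.get("entityID") == entity_id:
            return [
                service.get("Location")
                for service in entity.iter(MD + "SingleSignOnService")
            ]
    return None


print(find_login_endpoints(AGGREGATE, "https://idp.adamscollege.edu/entity"))
```

If the Identity Provider changes its login point, it updates its own entry once and every service provider reading the aggregate picks up the change.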

[Figure: map of pilot and production federations. REFEDS, 2017]

Open Authorisation (OAuth) is SAML's little sister. Its latest version, OAuth 2.0, was released in May 2010 and is yet to fulfil its potential, though it is fast gaining popularity among mobile application developers. An important observation to make: as the name suggests, OAuth deals with authorisation, not authentication, as it is designed to help one application access another application's data. You may be familiar with this:

You may have also seen similar prompts when downloading applications from Google Play or Apple's App Store. As part of the authorisation framework, the application will ask for your permission to access your data from another application. This would sometimes only happen once and other times you would be prompted more frequently.

After clicking 'Authorize' or 'Allow', the app that popped the question will send an authorisation code to the app that requested access. In our example it will be ORCID granting Scopus access to your data. The authorisation code can be compared to a bank cheque: on its own it's a worthless piece of paper, but when you take it to the bank you may exchange it for real money. Some cheques are valid for a month, three or six months, but an authorisation code's lifetime is normally measured in minutes and seconds. So the receiver must go and cash it in quickly to obtain the access token (money) in return. This access token will allow it to go to the shop, ORCID, and access information about the user for a certain period of time, i.e. shop until the money runs out! Sometimes money runs out really quickly, but some apps are more generous than others and write big cheques. Facebook, for instance, will allow apps to access your data for 60 days.
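The cheque analogy can be played out in a few lines of Python. This is a toy model, not an OAuth implementation: the class name `ToyAuthorisationServer`, the 60 second code lifetime and the one hour token lifetime are assumptions picked for the example, and real servers also check which application is asking, where to send the answer and much more.

```python
import secrets


class ToyAuthorisationServer:
    """Plays the bank: issues short-lived cheques and swaps them for money."""

    CODE_LIFETIME = 60      # seconds a cheque stays valid (assumed value)
    TOKEN_LIFETIME = 3600   # seconds the money lasts (assumed value)

    def __init__(self):
        self._codes = {}    # authorisation code -> expiry time
        self._tokens = {}   # access token -> expiry time

    def issue_code(self, now):
        """The user clicked 'Allow': write a cheque."""
        code = secrets.token_urlsafe(16)
        self._codes[code] = now + self.CODE_LIFETIME
        return code

    def exchange(self, code, now):
        """Cash the cheque. Each code works once and only before it expires."""
        expires_at = self._codes.pop(code, None)
        if expires_at is None or now > expires_at:
            return None
        token = secrets.token_urlsafe(32)
        self._tokens[token] = now + self.TOKEN_LIFETIME
        return token

    def token_is_valid(self, token, now):
        """Can the app still go shopping with this token?"""
        return now <= self._tokens.get(token, float("-inf"))


bank = ToyAuthorisationServer()
cheque = bank.issue_code(now=0)
money = bank.exchange(cheque, now=30)        # cashed in time
print(money is not None)
print(bank.exchange(cheque, now=31))         # the same cheque cannot be cashed twice
print(bank.token_is_valid(money, now=4000))  # the money has run out
```

Time is passed in as a plain number so the expiry rules are easy to follow; a real system would read the clock itself.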

The process is simple, so not surprisingly the protocol was well-received and quickly adopted. It was soon noticed, however, that OAuth 2.0 was being misused for authentication, which it was never designed to perform. A range of security issues were discovered, most of which are now well documented and available on the World Wide Web. The famous "Signing into One Billion Mobile App Accounts Effortlessly with OAuth 2.0" by Yang, Lau and Liu (2017) is an astonishing example of our inclination to trust technology and perhaps a nudge to nurture our inquisitive nature a little bit more.

In 2014, a self-proclaimed "league of backstabbing competitors" (Leszcz, 2017) developed OpenID Connect, also known as OIDC: a protocol that adds an authentication layer on top of OAuth 2.0, making it more secure as well as facilitating a superior user experience. The protocol was first adopted by its creators, Google, Microsoft and Ping Identity, then by other technology giants such as Amazon, IBM, ForgeRock and PayPal. Big names sound encouraging, but what does it actually do and why would you want to know about it?

Although current library technologies are in no imminent danger of being taken over by OpenID Connect implementations, it is rapidly gaining an audience and, if all goes well, it might just replace SAML in a decade or so. You may already be using applications that promote this authentication method, for example to access My Day by Collabco, Moodle, Office 365 or Open edX. There is also another reason why I want you to know about OIDC. When choosing between two VLE systems or two student platforms, or even between several access options when subscribing to an online resource, the one that supports OpenID Connect should win against the one that only does OAuth, be it OAuth 1.0 or OAuth 2.0.

Even if it's just from a security perspective; even if just for you.

Remember the access token, the real money, that Scopus used to access your data from ORCID? In a scenario where only OAuth 2.0 is used, Scopus has no way of knowing whether you are still logged into ORCID, so it can keep on shopping until the money runs out (the access token expires). When OpenID Connect is at play, however, Scopus would receive an ID token together with the access token. In other words, a photocopy of your passport in addition to money. As well as useful personal information such as name and surname, which will help the app provide a better service, the photocopy will contain a time stamp allowing its validity to expire, as well as proof that you are definitely logged in. ID tokens can be signed, encrypted and otherwise secured to a high standard, which is another great feature of OpenID Connect.

## KEY POINTS

OAuth 2.0 deals with authorisation only; OpenID Connect adds an identity layer to it, making secure authentication possible.

Think "app to app" communication rather than "app to user" or "user to provider". Implementation of this authentication method will normally require some development effort.
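Here is a hedged sketch of what an app might check on the "photocopy of your passport". It assumes the ID token has already been decoded and its signature verified by a proper library, which is the hard part and is not shown. The claim names `iss`, `aud`, `exp` and `sub` are the standard OpenID Connect ones; the issuer address, client name and other values are invented for the example.

```python
def id_token_is_acceptable(claims, expected_issuer, client_id, now):
    """Basic sanity checks on already-verified ID token claims."""
    audience = claims.get("aud")
    # 'aud' may be a single value or a list of values.
    audiences = audience if isinstance(audience, list) else [audience]
    return (
        claims.get("iss") == expected_issuer   # issued by the provider we expect
        and client_id in audiences             # issued for our application
        and now < claims.get("exp", 0)         # the time stamp has not run out
    )


# Invented example claims; times are plain numbers for readability.
claims = {
    "iss": "https://login.example.org",
    "aud": "my-library-app",
    "sub": "user-12345",
    "exp": 1000,
}

print(id_token_is_acceptable(claims, "https://login.example.org", "my-library-app", now=900))
print(id_token_is_acceptable(claims, "https://login.example.org", "my-library-app", now=1100))
```

The second call fails because the photocopy has expired, which is exactly the safeguard an access token alone does not give the app.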

# 60 second diagnostics

Resource access issues can sometimes be caused by an incomplete setup. If you have used the "60 seconds diagnostics" flowchart and ended up on the "Contact the publisher" suggestion, this is probably why. Let's have a look at what providers need from you to successfully enable access for your organisation.

#### Access by ... username and password.

Avoid if possible. Nothing is required from you to set this up: the publisher will provide you with credentials that you will be asked to share within your institution, and users will take it from there.

#### Access via ... IP recognition.

Send the publisher the range of your external on-site IP addresses. If you are using a proxy to facilitate remote access, add your proxy IP as well, advising that this is a proxy IP (they will see much more traffic from this address and may decide to block it if not notified otherwise). When providing on-site IP addresses, make sure they do not start with 10.\*, 172.16.\* to 172.31.\* or 192.168.\*, as these addresses are private, meant for internal use only. Your networking team will have set up a translation protocol that turns these internal addresses into one or more external IPs, which is what the publisher will be interested in.
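If you have a long list of addresses and are not sure which ones are internal, a few lines of Python using the standard `ipaddress` module can sort them for you. This is a minimal sketch covering only the three private ranges named above; the sample addresses and the function names are made up for illustration.

```python
import ipaddress

# The three private IPv4 ranges: 10.*, 172.16.* to 172.31.*, and 192.168.*
PRIVATE_RANGES = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_internal_only(address):
    """True if the address falls in one of the private ranges above."""
    ip = ipaddress.ip_address(address)
    return any(ip in network for network in PRIVATE_RANGES)


def addresses_to_send(candidates):
    """Keep only the addresses a publisher could actually see."""
    return [address for address in candidates if not is_internal_only(address)]


# Example addresses only; replace with the list from your networking team.
print(addresses_to_send(["192.168.1.5", "10.20.30.40", "172.20.0.9", "203.0.113.10"]))
```

Anything the script filters out is a sign that you were given internal addresses and should ask the networking team for the external ones.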

#### Access via ... SAML authentication.

If your institution belongs to a SAML federation, providers will probably only require your entityID and scope to enable access. Very few would ask for particular attributes, such as an email address or a specific string of characters, to be passed to them as part of the attributes statement. One thing to bear in mind though (this comes up very often): publishers will often refer to federated access as "Shibboleth". Shibboleth is a popular open source software used to aid SAML authentication which many digital content providers are familiar with. It was so popular in the early days of SAML that the name became synonymous with it and, funnily enough, some would have never heard of the protocol but would recognize the sound of Shibboleth. Don't let this confuse you: whoever supports Shibboleth will be capable of setting up SAML authentication for you.

If you are looking to make a one-to-one SAML connection to an application such as Moodle or Blackboard, instructions will usually be provided. If in doubt, the principle is the same as with federated access: metadata exchange. You will need to provide your metadata file to the requesting party and obtain theirs, then add theirs to your system and they will add yours. Job done!

Access disrupted, phone is ringing off the hook while the service desk people on the other end (publisher, software vendor, IT team) are taking their time? Very stressful, very frustrating and it's not your fault! Having had the privilege to be in the role of the outraged customer representing institutional interests, as well as a support analyst for such outraged customers, I have observed a few things that help speed up the resolution time, every time.

1. Try to identify the root cause of the issue if at all possible. Use the flowchart from "60 seconds diagnostics" to get an idea of what may have gone wrong. This step will either save you a lot of time or at the very least reduce the likelihood of hearing it's someone else's problem.

2. Pick up the phone. Really. This is an obvious one but you would be surprised how rarely people do it! If you are looking for quick results, opt for a call rather than email. I will agree with you if you have just thought to yourself that it is impossible to find publishers' help desk numbers online. Online forms and email addresses that send automatic "we will get back to you within the next 24 hours to 5 working days" replies make their life easier, help manage the workload and so on. However, if your institution has got an audit in the next few hours, or access to the resource you have based your presentation on is not working... I call it mission critical.

Can't find the number for the help desk? Call their sales team or, if you have one, your sales representative. I guarantee they will pass you through to the technical team or get them to call you back. (Sound distressed!)

3. Email screenshots and steps to reproduce the issue. This is just as essential as getting the help desk's attention in the first place. Unless you are affected by a service-wide issue or it's a well-known bug, the technical team will not know precisely what is wrong. One thing I have learnt is that there are a million ways to get to the same error message. Tell them exactly what you clicked on and where it took you, and attach a screenshot of the error message that followed. If at all possible, provide test credentials.

4. Confirm the person dealing with your issue. A name and the help desk's number is a great start: sometimes just knowing your special helper's name inspires greater responsibility. If all else fails, you can at least encourage accountability.

On the other end of the scale are super-helpful workers who will not hesitate to provide you with their personal work email address or direct dial. This is amazing when dealing with an ongoing emergency; however, if you want this special attention when the next disaster strikes, better not put the poor guy on speed dial for not so urgent issues.

#### You've made it!

I sincerely thank you for your time. The world of identity and access management is vast and growing fast but so little of it affects how we access online resources today. I am excited to see new technologies seep into the library and enrich the way people experience knowledge.

With promising projects well under way we may finally be able to combine security with usability. Librarians are getting very savvy working with all the different, sometimes even incompatible, systems they are presented with. I hope this won't be necessary for long.

Lastly, I hope this short read will have made your access management less of a maze and more of a walk in the park.

Yours truly,

Kristina

# Bibliography

Aaf.edu.au. (2017). Australian Access Federation. [online] Available at: https://aaf.edu.au/price [Accessed 10 Jul. 2017].

Edgar, L. (2015). EZproxy: Migrating From a Local Server to a Hosted Environment. Journal of Electronic Resources Librarianship, 27(3), pp.194-199.

Fontana, J. (2017). Hacks battered IT optimism in 2016; can 2017 enrich defenses | ZDNet. [online] ZDNet. Available at: http://www.zdnet.com/article/hacks-battered-it-optimism-in-2016-can-2017-enrich-defenses [Accessed 9 Jul. 2017].

Ft.com. (2017). Financial Times. [online] Available at: https://ft.com [Accessed 9 Jul. 2017].

Gartner IT Glossary. (2017). Identity Management - Access Management - Gartner Research. [online] Available at: https://research.gartner.com/definition-whatis-identity-access-management [Accessed 11 Jul. 2017].

Incommon.org. (2017). InCommon Participants. [online] Available at: https://www.incommon.org/participants [Accessed 10 Jul. 2017].

Leach, P., Franks, J., Luotonen, A., Hallam-Baker, P., Lawrence, S., Hostetler, J. and Stewart, L. (2017). RFC 2617 - HTTP Authentication: Basic and Digest Access Authentication. [online] Tools.ietf.org. Available at: https://tools.ietf.org/html/rfc2617 [Accessed 11 Jul. 2017].

Leszcz, M. (2017). The Foundation of Internet Identity | OpenID. [online] Openid.net. Available at: http://openid.net/2016/09/27/the-foundation-of-internet-identity [Accessed 11 Jul. 2017].

Publisher Solutions International (2017). The IP Registry - The Global IP Address Database. [online] Theipregistry.org. Available at: http://theipregistry.org [Accessed 11 Jul. 2017].

REFEDS (2017). Federations Map. [image] Available at: https://refeds.org/federations/federations-map [Accessed 11 Jul. 2017].

Winnard, K., Bussche, M., Choi, W. and Rossi, D. (2016). Managing Digital Certificates across the Enterprise. [S.l.]: IBM Redbooks, p.16.

Wright, C., Freedman, B. and Liu, D. (2008). The IT regulatory and standards compliance handbook. Burlington, MA: Syngress Pub., pp.522-523.

Yang, R., Lau, W. and Liu, T. (2017). Signing into One Billion Mobile App Accounts Effortlessly with OAuth2.0. [ebook] Available at: https://www.blackhat.com/docs/eu-16/materials/eu-16-Yang-Signing-Into-Billion-Mobile-Apps-Effortlessly-With-OAuth20-wp.pdf [Accessed 11 Jul. 2017].
